Decoding spatiotemporal features of emotional body language in social interactions

How are emotions perceived through human body language in social interactions? This study used point-light displays of human interactions portraying emotional scenes (1) to examine quantitative intrapersonal kinematic and postural body configurations, (2) to calculate interaction-specific parameters of these interactions, and (3) to analyze how far both contribute to the perception of an emotion category (i.e. anger, sadness, happiness or affection) as well as to the perception of emotional valence. By using ANOVA and classification trees, we investigated emotion-specific differences in the calculated parameters. We further applied representational similarity analyses to determine how perceptual ratings relate to intra- and interpersonal features of the observed scene. Results showed that within an interaction, intrapersonal kinematic cues corresponded to emotion category ratings, whereas postural cues reflected valence ratings. Perception of emotion category was also driven by interpersonal orientation, proxemics, the time spent in the personal space of the counterpart, and the motion–energy balance between interacting people. Furthermore, motion–energy balance and orientation relate to valence ratings. Thus, features of emotional body language are connected with the emotional content of an observed scene and people make use of the observed emotionally expressive body language and interpersonal coordination to infer emotional content of interactions.

interactions might be embodied synchronization or proxemic measures such as distance and orientation [27][28][29]. Taken together, the aforementioned studies suggest that interaction-specific parameters also contribute to the perception and identification of emotions. However, it remains largely unknown which features drive emotion perception in social interactions on the intra- and interpersonal level.
Here, we investigate for the first time both levels of body features in social interactions and their influence on the perception of emotions from body language. We provide a quantitative description and computational framework of movement features in social interactions using univariate and multivariate analysis. In detail, we investigated intrapersonal EBL by computing several kinematic and postural features and relating them to emotion perception. Moreover, we focused on interaction-specific characteristics that contribute to emotion perception. We used 48 point-light displays (PLDs) of human interactions portraying four emotions (happiness, affection, sadness, and anger). Participants observed these stimuli and were asked to categorize both the depicted emotional content and the valence of the perceived stimulus. We quantified different intra- as well as interpersonal movement features and analysed differences between emotional categories. To evaluate the relative importance of each calculated feature in the classification of emotional content, we trained different decision tree classifiers. Finally, we explored the correspondence of both the perceived emotional content and the perceived valence of a scene to the computational features on intra- and interpersonal levels via representational similarity analysis (RSA).

Materials and methods
Participants. A total of 31 participants (16 women) with a mean age of 23.58 ± 3.54 years participated in the experiment. None reported any history of psychiatric or neurological disorders and they had no history or current use of psychoactive medication. All procedures were approved by the local ethics committee of the Department of Psychology and Sports Science of the Justus Liebig University Giessen and adhered to the Declaration of Helsinki. All participants provided written informed consent prior to participating.
Stimuli. Stimuli were selected from a larger motion-capture data set 17. Eight pairs of non-professional actors were instructed to perform an interaction portraying one out of four emotional scenes depicting either happiness, affection, sadness, or anger. To ensure a congruent behavioural pattern, actors were given a script of emotional situations and directed specifically to perform the same emotion. They were instructed to express their emotions intuitively within the context of the given situation, thereby allowing freedom and enhancing the variability of expression 17. Interactions were recorded with an optical motion capture system (Vicon Motion Systems, Oxford, England) operating at 100 Hz. MATLAB software (Mathworks, Natick, MA) was used to create video files of 4-s sequences from the original coordinate 3D (C3D) files. In each video, 15 markers per person were plotted as white spheres on a black background to present a standard PLD model 30.
The final stimulus selection was based on prior validation of emotion category and perceived valence from 24 participants who did not take part in the present experiment. Valence was judged on an 11-point scale ranging from −5 (extremely negative) to +5 (extremely positive). There were two validation criteria: first, at least 50% of the participants had to recognize the displayed emotion (e.g., anger); second, the second-highest emotion rating should not exceed 25%. This allowed us to identify and exclude ambiguous scenes in which a specific emotion could not be recognized reliably. After validation, 12 stimuli that met both criteria were selected randomly for each emotion category. This resulted in a set of 48 (4 emotions × 12 scenes) stimuli. For more information on stimulus creation and validation, see Supplementary Figs. S1 and S11 as well as ref. 17.
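The two validation criteria can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `passes_validation` and its parameters are hypothetical, and it assumes that each validation participant gave exactly one category response per stimulus.

```python
from collections import Counter


def passes_validation(responses, target, min_target=0.50, max_runner_up=0.25):
    """Return True if a stimulus meets both validation criteria.

    responses: one category label per validation participant (assumption).
    target: the emotion the actors were instructed to portray.
    """
    counts = Counter(responses)
    n = len(responses)
    target_share = counts[target] / n
    # Largest share among all non-target categories (0.0 if there are none).
    runner_up = max(
        (count / n for label, count in counts.items() if label != target),
        default=0.0,
    )
    return target_share >= min_target and runner_up <= max_runner_up
```

With 24 validation participants, a scene rated as anger by 14, sadness by 5, happiness by 3, and affection by 2 would pass, whereas a scene rated as anger by 12 and sadness by 8 would be excluded as ambiguous.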
Experimental procedure. Prior to the present experiment, participants were given instructions and acquainted with the task. They subsequently performed a test run containing 12 trials that were not included in the main experiment.
In the experiment, each sequence was presented once, resulting in a series of 48 sequences. Sequences were displayed in a pseudo-randomized order on a 12-in. screen (refresh rate 60 Hz). The distance between each test person's eyes and the screen was approximately 40 cm. Each trial started with a fixation phase (1 s) followed by a stimulus sequence (4 s) and two behavioural ratings. After observing this sequence, participants were asked to assess the emotional valence of the videos on the same scale that had been used for stimulus validation (7 s). The second step was to sort emotions into one of the following categories: happiness, affection, sadness, or anger (4 s) (Fig. 1A).

Feature definition.
To investigate EBL characteristics that drive the perceptual judgement on an intra- and interpersonal level, we calculated several features using MATLAB software. From the 15 markers displayed, we chose 13 anatomical points (excluding sternum and sacrum) that presented anatomical landmarks on the upper body (including shoulders, elbows, wrists, and head) and the lower body (including hips, knees, and ankles). Features were calculated from the x, y, and z coordinates.
On an intrapersonal level, the three kinematic features (calculated for each anatomical point) addressed velocity, acceleration, and vertical movement. We implemented symmetry, limb angles (shoulder, elbows, hips, knees), limb contraction (distance from head to wrist and ankles), volume, as well as its standard deviation (volume STD) as postural features 9,24. Each feature was calculated within each of the 400 frames and averaged across time and actors.
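As a rough illustration of how kinematic features of this kind can be derived from marker coordinates, the following Python sketch uses simple finite differences. The exact definitions used in the study are given in the supplementary information and the SAMI toolbox; the function names here are hypothetical, the vertical axis is assumed to be z, and vertical movement is assumed to be the mean absolute frame-to-frame displacement.

```python
import math
import statistics

FPS = 100  # capture rate in Hz, as reported for the recordings


def speed_profile(track, fps=FPS):
    """Frame-to-frame speed of one marker; track is a list of (x, y, z)."""
    return [math.dist(a, b) * fps for a, b in zip(track, track[1:])]


def acceleration_profile(track, fps=FPS):
    """Absolute change in speed between consecutive frame pairs."""
    speeds = speed_profile(track, fps)
    return [abs(b - a) * fps for a, b in zip(speeds, speeds[1:])]


def vertical_movement(track, vertical_axis=2):
    """Mean absolute displacement per frame along the (assumed) vertical axis."""
    steps = [
        abs(b[vertical_axis] - a[vertical_axis])
        for a, b in zip(track, track[1:])
    ]
    return statistics.fmean(steps)


def mean_speed(track, fps=FPS):
    """Average the speed profile across time, as done for each feature."""
    return statistics.fmean(speed_profile(track, fps))
```

Averaging such per-marker values across the 13 anatomical points and both actors would then yield one value per stimulus and feature.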
In a next step, we computed 12 interpersonal parameters. Proximity measures included interpersonal distance (IPD) and its variability over time (IPD STD), the percentage of time spent in the personal space of the other agent (personal space), as well as interpersonal orientation (IPO) and the ratio of orientation from one person to another to detect imbalances (IPO balance) in which the persons are turned towards each other 9. To investigate how the spatial distance between two people affects velocity, acceleration, limb angles, and limb contraction (with included time information), we correlated these measures with the distance profile (distance correlations). We also examined the synchronization of the velocity and acceleration profiles (synchronization velocity & acceleration) 29,33. Finally, we calculated the proportion of the displayed motion energy (motion-energy balance) of each person 9,20,34. For more detailed information on feature definitions and calculations, see Table 1, supplementary information, and ref. 35.
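A few of the proxemic and balance measures can be sketched in Python as follows. This is an illustration under stated assumptions rather than the SAMI implementation: each person's position is taken as the centroid of the markers, the personal-space radius is a free parameter (the value used in the study is not given in this section), and motion-energy balance is expressed as the ratio of the smaller to the larger summed motion energy.

```python
import math
import statistics


def centroid(markers):
    """Mean position of one person's markers in a single frame."""
    n = len(markers)
    return tuple(sum(m[axis] for m in markers) / n for axis in range(3))


def ipd_profile(person_a, person_b):
    """Interpersonal distance per frame.

    Each person is a list of frames; each frame is a list of (x, y, z) markers.
    """
    return [
        math.dist(centroid(frame_a), centroid(frame_b))
        for frame_a, frame_b in zip(person_a, person_b)
    ]


def proxemic_summary(ipd, personal_space_radius):
    """Mean IPD, its standard deviation, and percent of frames in personal space."""
    inside = sum(d < personal_space_radius for d in ipd)
    return {
        "ipd": statistics.fmean(ipd),
        "ipd_std": statistics.pstdev(ipd),
        "personal_space": 100.0 * inside / len(ipd),
    }


def motion_energy_balance(energy_a, energy_b):
    """Ratio of smaller to larger motion energy (1.0 means fully balanced)."""
    low, high = sorted((energy_a, energy_b))
    return low / high if high > 0 else 1.0
```

Under this definition, a scene in which one person moves four times as much as the other has a balance of 0.25.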
Data analysis and statistics. As a first step, we calculated the recognition rates (accuracy) of stimuli for each emotional category by comparing the target emotion with the behavioural response. To ensure a sufficient degree of stimuli recognizability, we tested each emotional category against chance (25%) using Bonferroni-corrected one-sample t tests.
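The recognition-rate test can be sketched in Python as below. The sketch computes per-participant accuracy, the one-sample t statistic against chance, and a Bonferroni-adjusted alpha for four category-wise tests; the p value itself would come from a t distribution with n − 1 degrees of freedom in a statistics package. Function names are hypothetical.

```python
import math
import statistics

CHANCE = 0.25        # four response categories
N_CATEGORIES = 4     # one test per emotion category
ALPHA = 0.05


def accuracy(targets, responses):
    """Share of trials in which the response matches the target emotion."""
    hits = sum(t == r for t, r in zip(targets, responses))
    return hits / len(targets)


def one_sample_t(values, mu):
    """t statistic for the mean of `values` against the reference value `mu`."""
    n = len(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return (statistics.fmean(values) - mu) / standard_error


# Bonferroni correction: divide alpha by the number of tests.
bonferroni_alpha = ALPHA / N_CATEGORIES
```

For one category, `one_sample_t` would be called with the per-participant accuracies for that category and `CHANCE` as the reference value.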
Influence of emotional categories. We tested for the emotion specificity of EBL features with a one-way ANOVA. The intrapersonal and interaction-specific features calculated from each stimulus were averaged across anatomical points and used as input. The ANOVA contained a four-level factor of emotion (happiness, affection, sadness, anger). Alpha was set at 0.05 for all statistical tests and post hoc pairwise comparisons were Bonferroni-corrected. Due to violations of the normal distribution in the values of interaction-specific features (distance correlation, synchronization), we normalized our data with a Fisher Z transformation 37,38. To minimize the influence of randomly splitting the displayed 48 stimuli into the training and the validation dataset, we used leave-one-out cross-validation to estimate the performance of the different classifiers. To avoid imbalanced datasets and hence bias, each category was represented equally in training and test data (leave one stimulus out per category). For more information, see Supplementary Fig. S2.
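The balanced leave-one-out scheme can be sketched in Python as follows. This is one possible reading of "leave one stimulus out per category": fold k holds out the k-th stimulus of every category, which gives 12 folds of 4 test and 44 training stimuli. How stimuli were actually grouped into folds is not stated in this section, so the pairing by position is an assumption, and the function name is hypothetical.

```python
from collections import defaultdict


def leave_one_out_per_category(labels):
    """Yield (train_indices, test_indices) with one test item per category."""
    by_category = defaultdict(list)
    for index, label in enumerate(labels):
        by_category[label].append(index)

    sizes = {len(members) for members in by_category.values()}
    if len(sizes) != 1:
        raise ValueError("categories must contain the same number of stimuli")

    n_folds = sizes.pop()
    for k in range(n_folds):
        # Hold out the k-th stimulus of each category (assumed pairing).
        test = sorted(members[k] for members in by_category.values())
        held_out = set(test)
        train = [i for i in range(len(labels)) if i not in held_out]
        yield train, test


emotions = ("happiness", "affection", "sadness", "anger")
labels = [emotion for emotion in emotions for _ in range(12)]
folds = list(leave_one_out_per_category(labels))
```

A classifier would be fitted on each training set and scored on the corresponding four held-out stimuli, and accuracy would be averaged over folds.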
Representational similarity analysis. We used representational similarity analysis (RSA) 41,42 to characterize the relationship between the perceptual ratings and computed EBL feature sets for each of the 48 stimuli. By relating the stimuli to each other and arranging the values horizontally and vertically in the same order, we created a symmetrical representational dissimilarity matrix (RDM) (48 × 48). Each entry describes the relation between two stimuli. In the main diagonal, the stimuli values are compared with themselves, resulting in a diagonal defined as zeros.
In a first step, we created two different model RDMs by assuming a categorical distinction between the emotion and the valence category of the stimuli. Therefore, the dissimilarities between identical categories were 0 and those between different categories were 1 (Fig. 1B).
Second, we calculated 31 individual single-subject RDMs for emotion categorization by also using binary variables (0 if identical emotional rating, 1 otherwise). Furthermore, we used individual valence ratings to create RDMs in which each cell corresponded to the pairwise absolute difference. Here and in the following step, we used the Euclidean distance measure (Fig. 1B) 24,35 .
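The two kinds of RDM described above can be sketched in a few lines of Python: a binary RDM for categorical labels (0 if two stimuli share a label, 1 otherwise) and a distance RDM for valence ratings (pairwise absolute difference). Function names are hypothetical, and the example labels and ratings are made up for illustration.

```python
def categorical_rdm(labels):
    """Binary dissimilarity: 0 for identical labels, 1 otherwise."""
    return [[0 if a == b else 1 for b in labels] for a in labels]


def absolute_difference_rdm(ratings):
    """Pairwise absolute difference between scalar ratings."""
    return [[abs(a - b) for b in ratings] for a in ratings]


# Illustrative input only: three stimuli with a label and a valence rating.
example_labels = ["anger", "anger", "sadness"]
example_valence = [-3, 2, 5]

emotion_rdm = categorical_rdm(example_labels)
valence_rdm = absolute_difference_rdm(example_valence)
```

Both matrices are symmetric with zeros on the main diagonal; for the 48 stimuli, each would be 48 × 48.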
To test which of the features related to the geometry of the model RDMs and the behavioural rating RDMs, we built feature RDMs representing the intrapersonal and interpersonal level (Fig. 1B). This step resulted in eight intrapersonal RDMs and 12 interpersonal RDMs.
Table 1. Summary of interaction-specific intrapersonal and interpersonal features calculated with the SAMI toolbox 35. For a detailed explanation, see supplementary information.
To describe and test the relationship between all RDMs, we calculated a matrix of pairwise correlations (Kendall's τA) between model and feature RDMs separately on the intrapersonal and interpersonal level. To account for multiple testing, we applied Bonferroni corrections based on the number of features in each set. We used multidimensional scaling (MDS) to gain a graphical impression of representational distances (computed as 1 − Kendall's τA). Furthermore, each feature RDM was tested against the behavioural RDMs using Kendall's τA for emotion categorization and Pearson correlation coefficients for valence ratings. Multiple testing was Holm-Bonferroni corrected, and the false-discovery rate was set at 0.05. The variance within the emotions and valence ratings across participants was represented by the noise ceiling and determined the amount of variance a model could explain.
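Kendall's τA between two RDMs can be sketched in plain Python as follows. The sketch compares the upper triangles of two symmetric RDMs and divides the difference between concordant and discordant pairs by the total number of pairs, so that ties lower the coefficient. It is an illustration, not the toolbox code, and the function names are hypothetical.

```python
from itertools import combinations


def upper_triangle(rdm):
    """Entries above the main diagonal of a square matrix, row by row."""
    n = len(rdm)
    return [rdm[i][j] for i in range(n) for j in range(i + 1, n)]


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs; ties count as zero."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    score = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        score += (product > 0) - (product < 0)
    n_pairs = len(x) * (len(x) - 1) // 2
    return score / n_pairs


def rdm_distance(rdm_a, rdm_b):
    """Representational distance between two RDMs: 1 - Kendall's tau-a."""
    return 1 - kendall_tau_a(upper_triangle(rdm_a), upper_triangle(rdm_b))
```

Because tied pairs contribute nothing to the numerator while still counting in the denominator, binary model RDMs cannot reach a τA of 1 against themselves, which is one reason this variant is commonly preferred when comparing categorical models with graded data.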
In the last step, we aimed to explore perceptual judgements by merging the intra- and interpersonal level, analogous to M3. Therefore, we focused on the feature that best explained the behavioural rating on both levels and additionally outperformed the remaining features in pairwise comparisons. We normalized the representational geometry and created a common feature space by averaging the corresponding RDMs (Fig. S3). Next, we investigated the relationship between the produced feature combination RDM and single-participant behavioural RDMs and tested the resulting model against all other feature RDMs in the same manner as described above.
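The combination step can be sketched in Python as below. The normalization method is not specified in this section, so the sketch assumes a min-max rescaling of the off-diagonal entries to the range 0 to 1 before averaging; this choice and the function names are assumptions for illustration only.

```python
def rescale_rdm(rdm):
    """Rescale off-diagonal entries to [0, 1]; keep the diagonal at zero."""
    n = len(rdm)
    off_diagonal = [rdm[i][j] for i in range(n) for j in range(n) if i != j]
    low, high = min(off_diagonal), max(off_diagonal)
    span = high - low
    return [
        [
            0.0 if i == j or span == 0 else (rdm[i][j] - low) / span
            for j in range(n)
        ]
        for i in range(n)
    ]


def average_rdms(rdms):
    """Element-wise mean of several equally sized RDMs."""
    n = len(rdms[0])
    return [
        [sum(rdm[i][j] for rdm in rdms) / len(rdms) for j in range(n)]
        for i in range(n)
    ]


def combine_features(intrapersonal_rdm, interpersonal_rdm):
    """Common feature space: average of the two rescaled RDMs."""
    return average_rdms(
        [rescale_rdm(intrapersonal_rdm), rescale_rdm(interpersonal_rdm)]
    )
```

Rescaling first keeps a feature with a large numeric range (for example, a distance in millimetres) from dominating a feature bounded between 0 and 1 when the two RDMs are averaged.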
To calculate features and perform data analysis we used the SAMI toolbox, which is available on GitHub and archived in Zenodo 35.

Results
Feature-based discrimination between emotion categories. On the intrapersonal level, the kinematic feature velocity revealed a significant main effect of emotion category. Bonferroni-corrected post hoc pairwise comparisons showed significantly faster movements for happiness compared to affection and sadness as well as for anger compared to sadness. Vertical movement also presented a significant main effect of emotion category: happiness was associated with more vertical displacement than anger, affection, and sadness. Volume average was significantly higher for happiness and anger than for sadness. The same was found for volume STD in which happiness and anger interactions were depicted through higher variance in volume than sadness.
For the interpersonal features, we found a significant main effect for IPD showing that the distance between two people was smaller when affection was expressed compared to happiness and anger. Likewise, IPD STD revealed smaller variability while expressing sadness compared to affection.
Examining distance correlation features (relation between IPD and intrapersonal features) revealed that IPD was associated more strongly with limb contraction when expressing affection compared to anger. The distance between interacting people affected volume to a higher degree when showing affection compared to anger.
A further main effect of emotion category was revealed for personal space. Personal space differed significantly between affection and happiness and between sadness and anger, showing that interacting people spent significantly more time in the personal space of their counterpart while expressing affection. Additionally, IPO revealed a significant main effect of emotion category showing that actors turned more towards each other while expressing affection compared to happiness, sadness, or anger. Regarding the motion-energy balance, we found a significant main effect of emotion revealing a lower motion-energy balance for sadness and anger compared to happiness and affection. Finally, balance in the time facing each other showed a main effect of emotion category with the highest IPO balance being for interacting agents portraying affection compared to sadness and anger. All results of the conducted ANOVAs can be found in Table 2 Fig. 2A). M2 showed that IPD and motion-energy balance were the most relevant features for classification on an interpersonal level (Fig. 2B). The combination model (M3) revealed the highest importance of vertical movement, velocity, IPD, IPO, and motion-energy balance (Fig. 2C).

Representational similarity analysis: relatedness of perceived emotions and EBL features.
To determine the relationship between the perceptual impression and EBL features, we carried out an RSA. The visual comparison between the model RDMs (Fig. 3A) and the average rating RDMs (Fig. 3B) revealed a high structural similarity. In a first step, we compared model RDMs (Fig. 3A) and feature RDMs on the intrapersonal and interpersonal levels (Fig. 3C,D). Representational distances (computed as "1 − Kendall's τA correlation") of the categorical and feature RDMs are depicted via MDS plots. Visual inspection of the intrapersonal MDS plot (Fig. 4A) showed a clear separation between kinematic and postural features. Within the interpersonal RDMs (Fig. 4B) motion-energy balance was located closest to emotion and valence category RDMs. Feature RDMs of vertical movement, velocity, limb angles, limb contraction, and volume & volume STD correlated positively with the emotion category model RDM. Limb angles and limb contraction also correlated positively with the valence model RDM. Regarding interpersonal features, we found weak positive correlations between IPO balance, IPD, personal space, IPO, and motion-energy balance and the emotion category model RDM, as well as between IPO balance, motion-energy balance, and IPD and the valence model RDM (Fig. 4A,B).
Second, we determined the relatedness between EBL features and perceptual impressions by correlating emotion- and valence-rating RDMs and intra- and interpersonal model RDMs. Regarding the relationship between perceived emotion and intrapersonal features, we found significant correlations for all kinematic and postural parameters except acceleration (Fig. 5A). The highest correlations were for vertical movement (r = 0.1) and velocity (r = 0.08). It has to be noted that all correlations were rather low, ranging from 0.01 to 0.1. Nevertheless, vertical movement performed better than the remaining features, as revealed by pairwise comparisons between the feature RDMs (Fig. 5A). None of the feature RDMs came close to the noise ceiling (0.29-0.31).
When comparing intrapersonal features with valence ratings, we identified significant correlations for each kinematic and postural feature ranging from 0.03 to 0.14. Data revealed that postural parameters performed better than kinematic parameters. As revealed by pairwise comparisons, limb angles correlated most strongly (r = 0.12) with valence ratings and performed significantly better than all other models (Fig. 5C). The second strongest correlation (r = 0.08) was found for limb contraction, which additionally outperformed all kinematic features. Hence, kinematic intrapersonal EBL features related more strongly to the perceived emotion category, and postural intrapersonal EBL features related more strongly to perceived valence.
Regarding the comparison between interpersonal features and valence ratings (Fig. 5D), the highest explanatory value was provided by IPO balance (r = 0.18). This also outperformed all other models (p < 0.001) with the exception of motion-energy balance (r = 0.18). Except for the four distance correlation RDMs, all interpersonal features attained weak significant correlations with valence ratings. Thus, emotion and valence perception of interacting people seems to depend most strongly on the displayed motion-energy balance and orientation as well as on proxemic measures (IPD, IPO, personal space). Furthermore, we conducted an explorative analysis of feature combinations (Fig. 3E). Regarding emotion perception, we averaged vertical movement with each of the six highest performing interpersonal features (IPO balance, personal space, motion-energy balance, IPO, IPD, DC LC). Only feature combinations between vertical movement and IPO (r = 0.11) as well as between vertical movement and motion-energy balance (r = 0.11) performed significantly better than the remaining combination models and all intra- and interpersonal models (p < 0.001) except for the combination between vertical movement and IPO balance. This indicates that emotion perception of EBL was best predicted not by a single feature in isolation, but by a combination of several features.
Regarding valence perception, averaging limb angles and IPO balance (r = 0.21), as well as limb contraction and IPO balance (r = 0.2) revealed higher correlations. Furthermore, pairwise comparisons revealed significant differences between all combination RDMs and feature RDMs on both levels (p < 0.001), except for the combination between limb angles and motion-energy balance as well as the single feature motion-energy balance. For more information, see Supplementary Figs. S9, S10 and Supplementary Table 2.

Discussion
Our data provide a detailed quantitative description of movement features in emotional interactions that are related to emotion perception. The systematic decomposition of an interaction into an intrapersonal and interpersonal level reveals that both levels relate substantially to the emotional content of the scene as well as to its perception. We show that the emotional content of social interactions has a specific kinematic and postural fingerprint and can be described via quantitative intra- and interpersonal parameters. Both levels are linked to each other inseparably. This linkage is reflected not only by a model that integrates intra- and interpersonal features (M3) exhibiting the best performance but also by the explorative analysis of feature combinations. We further show a strong correspondence between those features that characterize the emotional content of a stimulus and the features that are critical for emotion perception 3. Representational similarity analysis reveals that it is especially kinematic parameters that contribute to the perception of emotional content on an intrapersonal level, whereas on an interpersonal level, balance and proxemics parameters are important cues for the observer. It also becomes apparent that observers use mainly interaction-specific information to decode relational emotions such as affection. We further found that intrapersonal postural parameters such as limb angles and interpersonal balance parameters such as motion-energy balance and IPO balance show the strongest relation to the valence percept.
Recently, de Gelder and Poyo Solanas have proposed a framework in which perceptually relevant information from bodies via movement and posture is coded in the brain through midlevel features such as limb contraction and head-to-hand distance 43. Our results support the importance of these midlevel features and add computational interaction-specific parameters to their framework. The present data show that the emotional content of a scene is characterized by midlevel features such as velocity or motion-energy balance. For example, happy interactions are characterized by higher velocity profiles than affection and sadness, but not higher than anger. These findings are broadly consistent with those reported in the existing literature 2,3,19,20,22. Affectionate and sad interactions show a high degree of similarity regarding their intrapersonal kinematic and postural parameters. These emotions, however, reveal characteristic differences on the interpersonal level (e.g., IPO, IPO balance, IPD STD, personal space).
Regarding emotion perception, our findings show an association to characteristic body expressions on both the intra- and the interpersonal level. Representational similarity analysis reveals that vertical movement, IPO (average & orientation), and motion-energy balance are best suited to explain emotion perception. In contrast to some research reports 12,24,36,44, we were unable to distinguish emotional categories via postural features such as limb angles and limb contraction. Here, it has to be taken into account that most former studies used stimuli depicting a single person mainly in a frontal view and not social interactions observed from a third-person perspective as in the present study. The present data show that participants confused happiness with anger, although only to a small extent. Conversely, anger trials were more often confused with sadness than with happiness. Most often, affection stimuli were confused with happiness. A study investigating emotions in gait 3 has demonstrated that confusions occur preferentially between emotions that share a similar level of movement activation: angry gaits tend to be confused with happy gaits, and sad gaits with fearful ones. Thus, these authors concluded that velocity is particularly important for the perception and expression of emotions 3,20,22. Our findings also suggest that velocity of movements is important in the process of emotion recognition. However, velocity is not sufficient to distinguish between emotions such as anger and happiness, especially within social interactions where interpersonal cues such as proxemics or balance are available for the observer. Interpersonal cues such as motion-energy balance between two agents allow a perceptual distinction between happiness and anger. Motion-energy balance explains (1) the high degree of confusion between happiness and affection and (2) the low degree of confusion between anger and happiness when social information is available.
Motion-energy balance within interactions, therefore, seems to be an important property for the observer to generate an emotional percept. Hence, social context information is particularly important for recognizing emotional content, especially when the depicted emotions depend more on reciprocal interactions (e.g., affection) 10,45 . The present results provide a computational framework for this observation. For example, affection differs from other emotions only regarding its interpersonal movement characteristics. This is underpinned by the calculated classification trees: the intrapersonal model is less accurate than the interpersonal model, underlining that emotions such as affection have a strong interpersonal character and that the spatiotemporal coupling of two moving agents seems to be of great significance especially for perceiving socially expressible emotions 10,17 .
Besides emotion recognition, we were interested in the perceived emotional valence, a dimension that reflects the subjective impression of a scene related to approach-avoidance tendencies 46. Our data reveal that on the intrapersonal level, postural features such as limb angles best explain the participants' valence perception. Regarding interpersonal features, motion-energy balance and orientation between interacting people are the best predictors of perceived valence.
Finally, we observed a noteworthy, albeit not significant, trend towards a synchronization of velocity profiles, indicating that higher synchronization between people is associated with a positive impression of the perceived interaction. A study investigating interpersonal behaviour in a social task has shown that patterns of proxemic behaviours and interpersonal distance predicted the subjective quality of interactions 28 . Thus, balance and spatiotemporal harmony are predictors for both the experienced and the observed quality of an interaction.
Interestingly, our RSA results show that emotion category recognition is better predicted by kinematic features, whereas valence perception is related more to postural features of the stimuli. Basically, human emotions can be conceptualized within a two-dimensional model comprising emotional valence (the subjective value, i.e., positive vs. negative) and arousal (intensity) 47,48. The present results reveal that emotions possessing the same valence (e.g., anger and sadness) are more similar in terms of the actors' postural features. Further, we observed that emotions that differ in terms of their valence but are similar in terms of their intensity (e.g., happiness and anger) resemble each other regarding their kinematics. Thus, one might assume that postural features are more likely to reflect the valence and kinematic features are more likely to reflect the arousal or intensity of the presented stimuli.
Altogether, we found a set of EBL features that characterizes emotional content and predicts the perception of the emotional quality of human interactions. These features are defined on an intra- and interpersonal level and include kinematic and postural characteristics as well as proximity, balance, and synchronization. We conclude that the perception of human emotional interactions is a function of not only inherent kinematics of the agent but also interpersonal balance and proximity between agents.
Limitations and future implications. It should be noted that the present and comparable studies differ with respect to the stimulus material used, stimulus length, emotional content, contextual information, and feature calculation 17,24. These differences explain the partly heterogeneous results on emotion perception. Despite this heterogeneity, perception and recognition of emotional content are robust regardless of the stimulus material used. Thus, humans seem to weigh the relative importance of different movement features flexibly depending on the specific stimulus properties presented to them. We have to acknowledge that neither an intrapersonal nor an interpersonal feature correlates with the perceptual performance on the noise ceiling level, and that we found only weak positive correlations in the present study 24. One reason for this may be that many features are similarly pronounced in different emotion categories. For example, happiness and anger are characterized by similar velocities. Hence, it would seem appropriate to develop models that integrate multiple feature dimensions of the observed scene. First solutions are offered by the present attempt to use a combination of features to classify the emotional content as well as to predict the emotional percept.
Future studies, however, might apply more ecologically valid stimuli and combine different features in a multidimensional space in order to phenotype emotion-specific properties of EBL in social interactions. Such approaches that aim to decode emotional human states from a combination of nonverbal signals on multiple levels are highly relevant in the context of human-robot interaction in order to ensure natural communication [47][48][49][50].

Data availability
The datasets used and analyzed during the current study are available from the corresponding author on reasonable request. The source code is available at https://zenodo.org/record/4764552#.YiXYKi9Xb0p (MATLAB).